23 research outputs found

Evaluation of the Importance of Time-Frequency Contributions to Speech Intelligibility in Noise

Author: John H. L. Hansen
Michael T. Johnson
Philipos C. Loizou
Kamil K. Wójcicki
Chengzhu Yu
Publication venue: e-Publications@Marquette
Publication date: 01/05/2014

Recent studies on binary masking techniques assume that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrates that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes: speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask error are also considered: miss errors and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance of the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility than two existing mask-based objective measures.
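The abstract does not give the formula for the loudness weighted hit-false measure, so the following is only a minimal sketch of the general idea: a hit rate minus a false-alarm rate in which each T-F unit carries its own weight. The function name `weighted_hit_fa` and the `weights` argument are hypothetical; in particular, the weights here are arbitrary numbers standing in for a loudness value, not the authors' loudness model.

```python
def weighted_hit_fa(ideal, estimated, weights=None):
    """Return (HIT, FA, HIT - FA) for two flat binary masks.

    ideal, estimated: sequences of 0/1, one entry per T-F unit.
    weights: optional non-negative per-unit weights (a hypothetical
             stand-in for loudness); None gives every unit equal weight.
    """
    if weights is None:
        weights = [1.0] * len(ideal)
    if not (len(ideal) == len(estimated) == len(weights)):
        raise ValueError("all inputs must have the same length")

    units = list(zip(ideal, estimated, weights))
    # Total weight of speech-present (ideal == 1) and speech-absent units.
    present_total = sum(w for m, _, w in units if m == 1)
    absent_total = sum(w for m, _, w in units if m == 0)
    # Weight of correctly retained units (hits) and wrongly retained ones (false alarms).
    hit_weight = sum(w for m, e, w in units if m == 1 and e == 1)
    fa_weight = sum(w for m, e, w in units if m == 0 and e == 1)

    hit = hit_weight / present_total if present_total else 0.0
    fa = fa_weight / absent_total if absent_total else 0.0
    return hit, fa, hit - fa
```

With equal weights this reduces to an unweighted hit rate minus false-alarm rate; with unequal weights, an error on a heavily weighted unit moves the score more than an error on a lightly weighted one.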

epublications@Marquette

PubMed Central

A new mask-based objective measure for predicting the intelligibility of binary masked speech

Author: Chengzhu Yu
John H L Hansen
Kamil K Wójcicki
P C Loizou
Publication date: 01/01/2013

Mask-based objective speech-intelligibility measures have been successfully proposed for evaluating the performance of binary masking algorithms. These objective measures are computed directly by comparing the estimated binary mask against the ground truth ideal binary mask (IdBM). Most of these objective measures, however, assign equal weight to all time-frequency (T-F) units. In this study, we propose to improve the existing mask-based objective measures by weighting each T-F unit according to its target or masker loudness. The proposed objective measure shows significantly better performance than two other existing mask-based objective measures.
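For readers unfamiliar with the ground truth mask mentioned above, an ideal binary mask is commonly defined by thresholding the local SNR of each T-F unit. The sketch below assumes that definition; the function name, the flat-list input layout, and the threshold parameter `lc_db` are illustrative choices, not details taken from the paper.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, lc_db=0.0):
    """Label each T-F unit 1 if its local SNR exceeds lc_db, else 0.

    target_energy, masker_energy: per-unit energies (same length, >= 0).
    lc_db: local SNR threshold in dB (hypothetical parameter name).
    """
    mask = []
    for t, m in zip(target_energy, masker_energy):
        if t <= 0.0:
            mask.append(0)      # no target energy: speech-absent unit
        elif m <= 0.0:
            mask.append(1)      # no masker energy: keep the unit
        else:
            local_snr_db = 10.0 * math.log10(t / m)
            mask.append(1 if local_snr_db > lc_db else 0)
    return mask
```

An estimated mask can then be compared unit by unit against this reference to count hits, misses, and false alarms.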

CiteSeerX

Nucleotide variants of the cancer predisposing gene CDH1 and the risk of non-syndromic cleft lip with or without cleft palate

Author: A Beeghly-Fadiel
A Letra
A Mostowska
AC Bőhmer
Adam Balcerek
Adrianna Mostowska
Agnieszka Lasota
AR Vieira
B Chen
Barbara Offert
CA Gonzales
CJ Gortardi
D Corso
D Ierodiakonou
DA Popoff
E Taioli
F Cattaneo
F Dudbridge
G Jacobs
GH Sperber
H Naoe
H Pinheiro
H Rafighdoost
I Kluijt
IP Vogelaar
Izabella Dunin-Wilczyńska
JC Fierro-Gonzalez
Kamil K. Hozyasz
LC Li
MJ Dixon
MP Stemmler
Paweł P. Jagodziński
Piotr Wójcicki
PR Benusiglio
RC Fitzgerald
SJ Lubbe
T Frebourg
Y Song
Y Wang
Z Zhan
Publication venue: Springer Science and Business Media LLC

Crossref

Preference for 20-40 ms window duration in speech analysis

Author: James G. Lyons
Kamil K. Wójcicki
Kuldip K. Paliwal
Publication date: 29/03/2011

In speech processing, the short-time magnitude spectrum is believed to contain most of the information about speech intelligibility, and it is normally computed using the short-time Fourier transform over a 20-40 ms window duration. In this paper, we investigate the effect of the analysis window duration on speech intelligibility in a systematic way. For this purpose, both subjective and objective experiments are conducted. The subjective experiment takes the form of a consonant recognition task by human listeners, whereas the objective experiment takes the form of an automatic speech recognition (ASR) task. In our experiments, various analysis window durations are investigated. For the subjective experiment, we construct speech stimuli based purely on the short-time magnitude information. The results of the subjective experiment show that an analysis window duration of 15–35 ms is the optimum choice when speech is reconstructed from the short-time magnitude spectrum. Similar conclusions were drawn from the results of the objective (ASR) experiment. The ASR results were found to have a statistically significant correlation with the subjective intelligibility results. Index Terms: analysis window duration, magnitude spectrum, automatic speech recognition, speech intelligibility.
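To make the role of the window duration concrete, the sketch below converts a duration in milliseconds to a frame length in samples and computes a short-time magnitude spectrum with a plain DFT. The sample rate, hop size, Hamming window, and function names are all assumptions chosen for illustration; the abstract does not state the analysis settings used in the paper.

```python
import cmath
import math


def frame_length(duration_ms, sample_rate_hz):
    """Number of samples covered by a window of the given duration."""
    return int(round(duration_ms * sample_rate_hz / 1000.0))


def short_time_magnitude(signal, sample_rate_hz, window_ms, hop_ms):
    """Return one magnitude spectrum (bins 0..n/2) per analysis frame."""
    n = frame_length(window_ms, sample_rate_hz)
    hop = frame_length(hop_ms, sample_rate_hz)
    # Hamming window (an assumed choice of analysis window).
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    frames = []
    for start in range(0, len(signal) - n + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(n)]
        # Direct DFT, kept simple for clarity rather than speed.
        spectrum = [
            abs(sum(seg[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
            for k in range(n // 2 + 1)
        ]
        frames.append(spectrum)
    return frames
```

At an assumed 8 kHz sample rate, a 32 ms window spans 256 samples and a 20 ms window spans 160, so changing the duration trades frequency resolution against time resolution.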

CiteSeerX

Crossref

The Effect of the Additivity Assumption on Time and Frequency Domain Wiener Filtering for Speech Enhancement

Author: Kamil K. Wójcicki
Kuldip K. Paliwal
Stephen So

In this paper, we investigate the validity of the common assumption made in Wiener filtering that the clean speech and noise signals are uncorrelated under the short-time analysis typically used for speech enhancement. To this end, we have performed speech enhancement experiments in which speech corrupted by additive white Gaussian noise is enhanced by a Wiener filter designed in the time domain as well as in the frequency domain. Results of oracle-style experiments confirm that the inclusion of the additivity assumption in Wiener filtering results in negligible degradation of enhanced speech quality. Informal listening tests show that the background noise resulting from time domain enhancement is more tolerable than the background noise resulting from the frequency domain framework. Index Terms: Wiener filtering, speech enhancement.
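One way to see what the uncorrelated assumption discards: for a single spectral coefficient, the noisy power equals the clean power plus the noise power plus a cross term, and treating the powers as additive amounts to setting that cross term to zero. The sketch below illustrates this, together with the textbook Wiener gain; it is an illustration under that reading of the abstract, with hypothetical function names, and is not the authors' experimental setup.

```python
def power(z):
    """Squared magnitude of a complex spectral coefficient."""
    return (z * z.conjugate()).real


def cross_term(s, n):
    """Term dropped when clean and noise powers are assumed to add."""
    return 2.0 * (s * n.conjugate()).real


def wiener_gain(speech_power, noise_power):
    """Textbook Wiener gain P_s / (P_s + P_n) for one frequency bin."""
    total = speech_power + noise_power
    return speech_power / total if total > 0.0 else 0.0


# Example coefficients (arbitrary values chosen for illustration).
s = complex(3.0, 4.0)   # clean speech coefficient
n = complex(1.0, -2.0)  # noise coefficient
y = s + n               # noisy coefficient

exact = power(y)                        # includes the cross term
additive = power(s) + power(n)          # cross term assumed to be zero
gain = wiener_gain(power(s), power(n))  # gain built from the additive model
```

For these values the exact noisy power and the additive approximation differ by exactly `cross_term(s, n)`, which is the quantity a short-time analysis cannot guarantee to be zero.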

CiteSeerX

A mutation in mouse Pak1ip1 causes orofacial clefting while human PAK1IP1 maps to 6p24 translocation breaking points associated with orofacial clefting.

Author: Adam P Ross
Adrianna Mostowska
Beth Davidson
Iannis E Adamopoulos
Jeffrey C Murray
Kamil K Hozyasz
Konstantinos S Zarbalis
M Adela Mansilla
Piotr Wójcicki
Richard Sturm
Roy L Maute
Samuel J Pleasure
Scott R May
Simon Helminski
Youngshik Choe
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

Orofacial clefts are among the most common birth defects and result in an improper formation of the mouth or the roof of the mouth. Monosomy of the distal aspect of human chromosome 6p has been recognized as causative in congenital malformations affecting the brain and cranial skeleton, including orofacial clefts. Among the genes located in this region is PAK1IP1, which encodes a nucleolar factor involved in ribosomal stress response. Here, we report the identification of a novel mouse line that carries a point mutation in the Pak1ip1 gene. Homozygous mutants show severe developmental defects of the brain and craniofacial skeleton, including a median orofacial cleft. We recovered this line of mice in a forward genetic screen and named the allele manta-ray (mray). Our findings prompted us to examine human cases of orofacial clefting for mutations in the PAK1IP1 gene or association with the locus. No deleterious variants in the PAK1IP1 gene coding region were recognized; however, we identified a borderline association effect for SNP rs494723, suggesting a possible role for the PAK1IP1 gene in human orofacial clefting.

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

eScholarship - University of California